Abstract

Recent results in Compressive Sensing have shown that, under certain conditions,
the solution to an underdetermined system of linear equations with sparsity-based
regularization can be accurately recovered by solving convex relaxations of the
original problem. In this work, we present a novel primal-dual analysis on a class
of sparsity minimization problems. We show that the Lagrangian bidual (i.e.,
the Lagrangian dual of the Lagrangian dual) of the sparsity minimization problems
can be used to derive interesting convex relaxations: the bidual of the
$\ell_0$-minimization problem is the $\ell_1$-minimization problem; and the bidual of the
minimization problem for enforcing group sparsity on structured data is the
$\ell_{1,\infty}$-minimization problem. The analysis provides a means to compute per-instance
non-trivial lower bounds on the (group) sparsity of the desired solutions. In a
real-world application, the bidual relaxation improves the performance of a
sparsity-based classification framework applied to robust face recognition.



1 Introduction 

The last decade has seen a renewed interest in the problem of solving an underdetermined system of
equations $Ax = b$, $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, where $m \ll n$, by regularizing its solution to be sparse,
i.e., having very few non-zero entries. Specifically, if one aims to find $x$ with the least number of
nonzero entries that solves the linear system, the problem is known as $\ell_0$-minimization:

$$(P_0): \quad x_0^* = \arg\min_x \|x\|_0 \quad \text{s.t.} \quad Ax = b. \tag{1}$$

The problem $(P_0)$ is intended to seek entry-wise sparsity in $x$ and is known to be NP-hard in general.
In Compressive Sensing (CS) literature, it has been shown that the solution to (1) often can be
obtained by solving a more tractable linear program, namely, $\ell_1$-minimization [4, 8]:

$$(P_1): \quad x_1^* = \arg\min_x \|x\|_1 \quad \text{s.t.} \quad Ax = b. \tag{2}$$

This unconventional equivalence relation between $(P_0)$ and $(P_1)$ and the more recent numerical
solutions [3, 16] to efficiently recover high-dimensional sparse signals have been a very competitive
research area in CS. Its broad applications have included sparse error correction [ ], compressive
imaging [23], image denoising and restoration [11, 17], and face recognition [13, 21], to name a few.
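
As a minimal sketch of why $(P_1)$ is a linear program, the snippet below splits $x$ into non-negative parts and hands the result to a generic LP solver. The function name `l1_min`, the use of SciPy's `linprog`, and the tiny matrix `A` are illustrative assumptions, not part of the original work.

```python
import numpy as np
from scipy.optimize import linprog

def l1_min(A, b):
    """Solve min ||x||_1 s.t. Ax = b using the split x = x_plus - x_minus."""
    n = A.shape[1]
    c = np.ones(2 * n)                 # objective: sum(x_plus) + sum(x_minus)
    A_eq = np.hstack([A, -A])          # A (x_plus - x_minus) = b
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]

# Tiny hypothetical instance: every solution has the form (t, 1 - t, t).
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 1.0])
x1 = l1_min(A, b)
```

On this instance the sparsest solution $(0, 1, 0)$ is also the unique $\ell_1$ minimizer, which is the kind of equivalence the text refers to.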

In addition to enforcing entry-wise sparsity in a linear system of equations, the notion of group
sparsity has attracted increasing attention in recent years [12, 13, 18]. In this case, one assumes that
the matrix $A$ has some underlying structure, and can be grouped into blocks: $A = [A_1 \ \cdots \ A_K]$,
where $A_k \in \mathbb{R}^{m \times d_k}$ and $\sum_{k=1}^K d_k = n$. Accordingly, the vector $x$ is split into several blocks as
$x^T = [x_1^T \ \cdots \ x_K^T]$, where $x_k \in \mathbb{R}^{d_k}$. In this case, it is of interest to estimate $x$ with the least
number of blocks containing non-zero entries. The group sparsity minimization problem is posed as

$$(P_{0,p}): \quad x_{0,p}^* = \arg\min_x \sum_{k=1}^K I(\|x_k\|_p > 0), \quad \text{s.t.} \quad Ax = [A_1 \ \cdots \ A_K] \begin{bmatrix} x_1 \\ \vdots \\ x_K \end{bmatrix} = b, \tag{3}$$

where $I(\cdot) \in \mathbb{R}$ is the indicator function. Since the expression $\sum_{k=1}^K I(\|x_k\|_p > 0)$ can be written
as $\|[\,\|x_1\|_p \ \cdots \ \|x_K\|_p\,]\|_0$, it is also denoted as $\ell_{0,p}(x)$, the $\ell_{0,p}$-norm of $x$.



Enforcing group sparsity exploits the problem's underlying structure and can improve the solution's 
interpretability. For example, in a sparsity-based classification (SBC) framework applied to face 
recognition, the columns of A are vectorized training images of human faces that can be naturally 
grouped into blocks corresponding to different subject classes, b is a vectorized query image, and 
the entries in x represent the coefficients of linear combination of all the training images for recon- 
structing b. Group sparsity lends itself naturally to this problem since it is desirable to use images 
of the smallest number of subject classes to reconstruct and subsequently classify a query image. 

Furthermore, the problem of robust face recognition has considered an interesting modification
known as the cross-and-bouquet (CAB) model: $b = Ax + e$, where $e \in \mathbb{R}^m$ represents possible
sparse error corruption on the observation $b$ [ ]. It can be argued that the CAB model can be
solved as a group sparsity problem in (3), where the coefficients of $e$ would be the $(K+1)$-th group.
However, this problem has a trivial solution for $e = b$ and $x = 0$, which would have the smallest
possible group sparsity. Hence, it is necessary to further regularize the entry-wise sparsity in $e$.

To this effect, one considers a mixture of the previous two cases, where one aims to enforce
entry-wise sparsity as well as group sparsity such that $x$ has very few non-zero blocks and the
reconstruction error $e$ is also sparse. The mixed sparsity minimization problem can be posed as

$$(MP_{0,p}): \quad \{x_{0,p}^*, e_0^*\} = \arg\min_{(x,e)} \ell_{0,p}(x) + \gamma \|e\|_0, \quad \text{s.t.} \quad [A_1 \ \cdots \ A_K] \begin{bmatrix} x_1 \\ \vdots \\ x_K \end{bmatrix} = b + e, \tag{4}$$

where $\gamma \ge 0$ controls the tradeoff between the entry-wise sparsity and group sparsity.



Due to the use of the counting norm, the optimization problems in (3) and (4) are also NP-hard in
general. Hence, several recent works have focused on developing tractable convex relaxations for
these problems. In the case of group sparsity, the relaxation involves replacing the $\ell_{0,p}$-norm with
the $\ell_{1,p}$-norm, where $\ell_{1,p}(x) = \|[\,\|x_1\|_p \ \cdots \ \|x_K\|_p\,]\|_1 = \sum_{k=1}^K \|x_k\|_p$. These relaxations are
also used for the mixed sparsity case [13].
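
To make the two block norms concrete, here is a small sketch that evaluates $\ell_{0,p}(x)$ and $\ell_{1,p}(x)$ for a vector with a given block structure. The helper names `split_blocks`, `l0p_norm` and `l1p_norm` and the example vector are hypothetical, chosen only for illustration.

```python
import numpy as np

def split_blocks(x, block_sizes):
    """Split x into consecutive blocks x_1, ..., x_K of sizes d_1, ..., d_K."""
    return np.split(np.asarray(x, dtype=float), np.cumsum(block_sizes)[:-1])

def l0p_norm(x, block_sizes, p=2):
    """Number of blocks with ||x_k||_p > 0 (the group-sparsity count)."""
    return int(sum(np.linalg.norm(xk, ord=p) > 0 for xk in split_blocks(x, block_sizes)))

def l1p_norm(x, block_sizes, p=2):
    """Sum of the block norms ||x_k||_p (the convex surrogate)."""
    return float(sum(np.linalg.norm(xk, ord=p) for xk in split_blocks(x, block_sizes)))

x = np.array([3.0, -4.0, 0.0, 0.0, 1.0])
sizes = [2, 2, 1]                       # blocks: [3, -4], [0, 0], [1]
group_count = l0p_norm(x, sizes)        # two non-zero blocks
surrogate_2 = l1p_norm(x, sizes, p=2)   # 5 + 0 + 1
surrogate_inf = l1p_norm(x, sizes, p=np.inf)  # 4 + 0 + 1
```

The count does not depend on $p$, whereas the surrogate does, which is the source of the question raised in the next paragraph.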

In this work, we are interested in deriving and analyzing convex relaxations for general sparsity
minimization problems. In the entry-wise case, the main theoretical understanding of the link between
the original NP-hard problem in (1) and its convex relaxation has been given by the simple fact that
the $\ell_1$-norm is a convex surrogate of the $\ell_0$-norm. However, in the group sparsity case, a similar
relaxation produces a family of convex surrogates, i.e., $\ell_{1,p}(x)$, whose value depends on $p$. This
raises the question of whether there is a preferable value of $p$ for the relaxation of the group sparsity
minimization problem. In fact, we consider the following more important question:

Is there a unified framework for deriving convex relaxations of general sparsity recovery problems?

1.1 Paper contributions

We present a new optimization-theoretic framework based on Lagrangian duality for deriving convex
relaxations of sparsity minimization problems. Specifically, we introduce a new class of equivalent
optimization problems for $(P_0)$, $(P_{0,p})$ and $(MP_{0,p})$, and derive the Lagrangian duals of the original
NP-hard problems. We then consider the Lagrangian dual of the Lagrangian dual to get a new
optimization problem that we term the Lagrangian bidual of the primal problem. We show
that the Lagrangian biduals are convex relaxations of the original sparsity minimization problems.
Importantly, we show that the Lagrangian biduals for the $(P_0)$ and $(P_{0,p})$ problems correspond to
minimizing the $\ell_1$-norm and the $\ell_{1,\infty}$-norm, respectively.

Since the Lagrangian duals for $(P_0)$, $(P_{0,p})$ and $(MP_{0,p})$ are linear programs, there is no duality
gap between the Lagrangian duals and the corresponding Lagrangian biduals. Therefore, the
bidual-based convex relaxations can be interpreted as maximizing the Lagrangian duals of the original
sparsity minimization problems. This provides new interpretations for the relaxations of sparsity
minimization problems. Moreover, since the Lagrangian dual of a minimization problem provides a
lower bound for the optimal value of the primal problem, we show that the optimal objective value
of the convex relaxation provides a non-trivial lower bound on the sparsity of the true solution to the
primal problem.



2 Lagrangian biduals for sparsity minimization problems 

In what follows, we will derive the Lagrangian bidual for the mixed sparsity minimization problem,
which generalizes the entry-wise sparsity and group sparsity cases (also see Section 3). Specifically,
we will derive the Lagrangian bidual for the following optimization problem:

$$x^* = \arg\min_x \sum_{k=1}^K \left[\alpha_k I(\|x_k\|_p > 0) + \beta_k \|x_k\|_0\right], \quad \text{s.t.} \quad [A_1 \ \cdots \ A_K] \begin{bmatrix} x_1 \\ \vdots \\ x_K \end{bmatrix} = b, \tag{5}$$

where $\forall k = 1, \ldots, K$: $\alpha_k \ge 0$ and $\beta_k \ge 0$. Given any unique, finite solution $x^*$ to (5), there
exists a constant $M > 0$ such that the absolute values of the entries of $x^*$ are bounded by $M$, namely,
$\|x^*\|_\infty \le M$. Note that if (5) does not have a unique solution, it might not be possible to choose a
finite-valued $M$ that upper bounds all the solutions. In this case, a finite-valued $M$ may be viewed
as a regularization term for the desired solution. To this effect, we consider the following modified
version of (5) where we introduce the box constraint that $\|x\|_\infty \le M$:

$$x^*_{\text{primal}} = \arg\min_x \sum_{k=1}^K \left[\alpha_k I(\|x_k\|_p > 0) + \beta_k \|x_k\|_0\right], \quad \text{s.t.} \quad Ax = b \ \text{and} \ \|x\|_\infty \le M, \tag{6}$$

where $M$ is chosen as described above to ensure that the optimal values of (6) and (5) are the same.

Primal problem. We will now frame an equivalent optimization problem for (6), for which we
introduce some new notation. Let $z \in \{0,1\}^n$ be an entry-based sparsity indicator for $x$, namely,
$z_i = 0$ if $x_i = 0$ and $z_i = 1$ otherwise. We also introduce a group-based sparsity indicator vector
$g \in \{0,1\}^K$, whose $k$-th entry denotes whether the $k$-th block $x_k$ contains non-zero entries or not,
namely, $g_k = 0$ if $x_k = 0$ and $g_k = 1$ otherwise. To express this constraint, we introduce a matrix
$\Pi \in \{0,1\}^{n \times K}$, such that $\Pi_{i,j} = 1$ if the $i$-th entry of $x$ belongs to the $j$-th block and $\Pi_{i,j} = 0$
otherwise. Finally, we denote the positive component and negative component of $x$ as $x_+ \ge 0$ and
$x_- \ge 0$, respectively, such that $x = x_+ - x_-$.
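
The notation above can be sketched in a few lines. The function name `sparsity_indicators` and the example vector are assumptions made for illustration; the code simply builds $\Pi$, $z$, $g$, $x_+$ and $x_-$ from a given $x$ and block sizes $d_k$.

```python
import numpy as np

def sparsity_indicators(x, block_sizes):
    """Return (Pi, z, g, x_plus, x_minus) for a vector x with the given blocks."""
    x = np.asarray(x, dtype=float)
    n, K = x.size, len(block_sizes)
    block_of = np.repeat(np.arange(K), block_sizes)  # block index of each entry
    Pi = np.zeros((n, K))
    Pi[np.arange(n), block_of] = 1.0                 # Pi[i, j] = 1 iff entry i lies in block j
    z = (x != 0).astype(float)                       # entry-wise sparsity indicator
    g = (Pi.T @ z > 0).astype(float)                 # block is non-zero iff it holds a non-zero entry
    x_plus, x_minus = np.maximum(x, 0.0), np.maximum(-x, 0.0)
    return Pi, z, g, x_plus, x_minus

x = np.array([0.0, 2.0, 0.0, 0.0, -1.0])
Pi, z, g, x_plus, x_minus = sparsity_indicators(x, [2, 2, 1])
M = np.abs(x).max()
```

With $M = \|x\|_\infty$, the vectors built this way satisfy $\Pi g \ge \frac{1}{M}(x_+ + x_-)$ and $z \ge \frac{1}{M}(x_+ + x_-)$ entry by entry.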

Given these definitions, we see that (6) can be reformulated as

$$\begin{aligned} \{x_+^*, x_-^*, z^*, g^*\} = \arg\min_{\{x_+, x_-, z, g\}} \ & [\alpha^T g + \beta^T z], \quad \text{s.t.} \quad \text{(a) } x_+ \ge 0, \ \text{(b) } x_- \ge 0, \ \text{(c) } g \in \{0,1\}^K, \\ & \text{(d) } z \in \{0,1\}^n, \ \text{(e) } A(x_+ - x_-) = b, \ \text{(f) } \Pi g \ge \tfrac{1}{M}(x_+ + x_-), \ \text{and (g) } z \ge \tfrac{1}{M}(x_+ + x_-), \end{aligned} \tag{7}$$

where $\alpha = [\alpha_1 \ \cdots \ \alpha_K]^T \in \mathbb{R}^K$ and $\beta = [\,\cdots \ \underbrace{\beta_k \ \cdots \ \beta_k}_{d_k \text{ times}} \ \cdots\,]^T \in \mathbb{R}^n$.

Constraints (a)-(d) are used to enforce the aforementioned conditions on the values of the solution.
While constraint (e) enforces the condition that the original system of linear equations is satisfied,
the constraints (f) and (g) ensure that the group sparsity indicator $g$ and the entry-wise sparsity
indicator $z$ are consistent with the entries of $x$.



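Lagrangian dual. The Lagrangian function for (7) is given as

$$\begin{aligned} L(x_+, x_-, z, g, \lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5) = {} & \alpha^T g + \beta^T z - \lambda_1^T x_+ - \lambda_2^T x_- + \lambda_3^T (b - A x_+ + A x_-) \\ & + \lambda_4^T \left(\tfrac{1}{M}(x_+ + x_-) - \Pi g\right) + \lambda_5^T \left(\tfrac{1}{M}(x_+ + x_-) - z\right), \end{aligned} \tag{8}$$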

where $\lambda_1 \ge 0$, $\lambda_2 \ge 0$, $\lambda_4 \ge 0$, and $\lambda_5 \ge 0$. In order to obtain the Lagrangian dual function, we
need to minimize $L(\cdot)$ with respect to $x_+$, $x_-$, $g$ and $z$ [ ]. Notice that if the coefficients of $x_+$ and
$x_-$, i.e., $\frac{1}{M}(\lambda_4 + \lambda_5) - A^T \lambda_3 - \lambda_1$ and $\frac{1}{M}(\lambda_4 + \lambda_5) + A^T \lambda_3 - \lambda_2$, are non-zero, the minimization
of $L(\cdot)$ with respect to $x_+$ and $x_-$ is unbounded below. To this effect, the constraints that these
coefficients are equal to 0 form constraints on the dual variables. Next, consider the minimization
of $L(\cdot)$ with respect to $g$. Since each entry $g_k$ only takes values 0 or 1, its optimal value $\hat{g}_k$ that
minimizes $L(\cdot)$ is given as

$$\hat{g}_k = \begin{cases} 0 & \text{if } \alpha_k - (\Pi^T \lambda_4)_k > 0, \text{ and} \\ 1 & \text{otherwise.} \end{cases} \tag{9}$$

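A similar expression can be computed for the minimization with respect to $z$. As a consequence,
the Lagrangian dual problem can be derived as

$$\begin{aligned} \{\lambda_i^*\}_{i=1}^5 = \arg\max \ & \left[\lambda_3^T b + \mathbf{1}^T \min\{0, \alpha - \Pi^T \lambda_4\} + \mathbf{1}^T \min\{0, \beta - \lambda_5\}\right], \quad \text{s.t.} \\ & \text{(a) } \forall i = 1, 2, 4, 5: \lambda_i \ge 0, \quad \text{(b) } \tfrac{1}{M}(\lambda_4 + \lambda_5) - A^T \lambda_3 - \lambda_1 = 0 \\ & \text{and (c) } \tfrac{1}{M}(\lambda_4 + \lambda_5) + A^T \lambda_3 - \lambda_2 = 0. \end{aligned} \tag{10}$$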

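This can be further simplified by rewriting it as the following linear program:

$$\begin{aligned} \{\lambda_i^*\}_{i=3}^7 = \arg\max \ & \left[\lambda_3^T b + \mathbf{1}^T \lambda_6 + \mathbf{1}^T \lambda_7\right], \quad \text{s.t.} \quad \text{(a) } \lambda_4 \ge 0, \ \text{(b) } \lambda_5 \ge 0, \ \text{(c) } \lambda_6 \le 0, \ \text{(d) } \lambda_7 \le 0, \\ & \text{(e) } \lambda_6 \le \alpha - \Pi^T \lambda_4, \ \text{(f) } \lambda_7 \le \beta - \lambda_5 \ \text{and (g) } -\tfrac{1}{M}(\lambda_4 + \lambda_5) \le A^T \lambda_3 \le \tfrac{1}{M}(\lambda_4 + \lambda_5). \end{aligned} \tag{11}$$

Notice that we have made two changes in going from (10) to (11). First, we have replaced constraints
(b) and (c) in (10) with the constraint (g) in (11) and eliminated $\lambda_1$ and $\lambda_2$ from (11). Second, we
have introduced variables $\lambda_6$ and $\lambda_7$ to encode the "min" operator in the objective function of (10).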
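
The steps from (8) to (11) can be written out as a short worked derivation. This is a sketch that only regroups the terms of (8) and uses no assumptions beyond the definitions above.

```latex
\begin{aligned}
L ={}& \lambda_3^T b
  + \left(\alpha - \Pi^T \lambda_4\right)^T g
  + \left(\beta - \lambda_5\right)^T z \\
 &+ \left(\tfrac{1}{M}(\lambda_4 + \lambda_5) - A^T \lambda_3 - \lambda_1\right)^T x_+
  + \left(\tfrac{1}{M}(\lambda_4 + \lambda_5) + A^T \lambda_3 - \lambda_2\right)^T x_- , \\[4pt]
\min_{g \in \{0,1\}^K} \left(\alpha - \Pi^T \lambda_4\right)^T g
 ={}& \mathbf{1}^T \min\{0,\ \alpha - \Pi^T \lambda_4\}, \qquad
\min_{z \in \{0,1\}^n} \left(\beta - \lambda_5\right)^T z
 = \mathbf{1}^T \min\{0,\ \beta - \lambda_5\}, \\[4pt]
\lambda_1 = \tfrac{1}{M}(\lambda_4 + \lambda_5) - A^T \lambda_3 \ge 0,\quad
\lambda_2 ={}& \tfrac{1}{M}(\lambda_4 + \lambda_5) + A^T \lambda_3 \ge 0
\;\Longleftrightarrow\;
-\tfrac{1}{M}(\lambda_4 + \lambda_5) \le A^T \lambda_3 \le \tfrac{1}{M}(\lambda_4 + \lambda_5).
\end{aligned}
```

The first line shows where the coefficients of $x_+$ and $x_-$ come from, the second gives the two "min" terms in the objective of (10), and the third shows how $\lambda_1$ and $\lambda_2$ drop out to leave constraint (g) of (11).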

Lagrangian bidual. We will now consider the Lagrangian dual of (11), which will be referred to as
the Lagrangian bidual of (7). It can be verified that the Lagrangian dual of (11) is given as

$$\begin{aligned} \{x_+^*, x_-^*, z^*, g^*\} = \arg\min_{\{x_+, x_-, z, g\}} \ & \alpha^T g + \beta^T z \quad \text{s.t.} \quad \text{(a) } x_+ \ge 0, \ \text{(b) } x_- \ge 0, \ \text{(c) } g \in [0,1]^K, \\ & \text{(d) } z \in [0,1]^n, \ \text{(e) } A(x_+ - x_-) = b, \ \text{(f) } \Pi g \ge \tfrac{1}{M}(x_+ + x_-) \ \text{and (g) } z \ge \tfrac{1}{M}(x_+ + x_-). \end{aligned} \tag{12}$$

Notice that in going from (7) to (12), the discrete valued variables $z$ and $g$ have been relaxed to take
real values between 0 and 1. Given that $z \le 1$ and noting that $x$ can be represented as $x = x_+ - x_-$,
we can conclude from constraint (g) in (12) that the solution $x^*$ satisfies $\|x^*\|_\infty \le M$. Moreover,
given that $g$ and $z$ are relaxed to take real values, we see that the optimal values for $g_k^*$ and $z_i^*$ are
$\frac{1}{M}\|x_k^*\|_\infty$ and $\frac{1}{M}|x_i^*|$, respectively. Hence, we can eliminate constraints (f) and (g) by replacing $z$
and $g$ by these optimal values. It can then be verified that solving (12) is equivalent to solving the
problem:

$$x^*_{\text{bidual}} = \arg\min_x \frac{1}{M} \sum_{k=1}^K \left[\alpha_k \|x_k\|_\infty + \beta_k \|x_k\|_1\right] \quad \text{s.t.} \quad \text{(a) } Ax = b \ \text{and (b) } \|x\|_\infty \le M. \tag{13}$$

This is the Lagrangian bidual for (7).
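
The relaxed program (12) is an ordinary linear program, so it can be passed to a generic solver. The sketch below stacks the variables as $[x_+, x_-, z, g]$ and uses SciPy's `linprog`. The function name `bidual_lp`, the solver choice and the toy data are assumptions made for illustration, not the implementation used in the experiments.

```python
import numpy as np
from scipy.optimize import linprog

def bidual_lp(A, b, block_sizes, alpha, beta, M):
    """Solve the relaxed program (12); returns x = x_plus - x_minus and the optimal value."""
    m, n = A.shape
    K = len(block_sizes)
    Pi = np.zeros((n, K))
    Pi[np.arange(n), np.repeat(np.arange(K), block_sizes)] = 1.0
    beta_full = np.repeat(np.asarray(beta, dtype=float), block_sizes)  # beta_k repeated d_k times
    I, Z = np.eye(n), np.zeros((n, n))
    c = np.concatenate([np.zeros(2 * n), beta_full, np.asarray(alpha, dtype=float)])
    A_eq = np.hstack([A, -A, np.zeros((m, n)), np.zeros((m, K))])      # (e): A (x+ - x-) = b
    A_ub = np.vstack([
        np.hstack([I / M, I / M, Z, -Pi]),                   # (f): (x+ + x-)/M <= Pi g
        np.hstack([I / M, I / M, -I, np.zeros((n, K))]),     # (g): (x+ + x-)/M <= z
    ])
    bounds = [(0, None)] * (2 * n) + [(0, 1)] * (n + K)      # (a)-(d)
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n), A_eq=A_eq, b_eq=b,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:2 * n], res.fun

# Hypothetical toy instance with two blocks of sizes 2 and 1 (group-sparsity case).
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 1.0])
x_bd, value = bidual_lp(A, b, block_sizes=[2, 1], alpha=[1, 1], beta=[0, 0], M=1.0)
```

Keeping $z$ and $g$ as explicit variables mirrors (12) directly; eliminating them as described in the text gives the equivalent form (13).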

3 Theoretical results from the biduality framework



In this section, we first describe some properties of the biduality framework in general. We will then 
focus on some important results for the special cases of entry-wise sparsity and group sparsity. 

Theorem 1. The optimal value of the Lagrangian bidual in (13) is a lower bound on the optimal 
value of the NP-hard primal problem in (7). 

Proof. Since there is no duality gap between a linear program and its Lagrangian dual [ ], the
optimal values of the Lagrangian dual in (11) and the Lagrangian bidual in (13) are the same. Moreover,
we know that the optimal value of a primal minimization problem is always bounded below by the
optimal value of its Lagrangian dual [2]. We hence have the required result. □

Remark 1. Since the original primal problem in (7) is NP-hard, we note that the duality gap
between the primal and its dual in (11) is non-zero in general. Moreover, we notice that as we increase
$M$ (i.e., a more conservative estimate), the optimal value of the primal is unchanged, but the optimal
value of the bidual in (13) decreases. Hence, the duality gap increases as $M$ increases.

$M$ in (6) should preferably be equal to $\|x^*_{\text{primal}}\|_\infty$, which may not be possible to estimate accurately
in practice. Therefore, it is of interest to analyze the effect of taking a very conservative estimate of
$M$, i.e., choosing a large value for $M$. In what follows, we show that taking a conservative estimate
of $M$ is equivalent to dropping the box constraint in the bidual.

For this purpose, consider the following modification of the bidual:

$$x^*_{\text{bidual-conservative}} = \arg\min_x \sum_{k=1}^K \left[\alpha_k \|x_k\|_\infty + \beta_k \|x_k\|_1\right] \quad \text{s.t.} \quad Ax = b, \tag{14}$$

where we have essentially dropped the box constraint (b) in (13). It is easy to verify that
$\forall M \ge \max\{\|x^*_{\text{primal}}\|_\infty, \|x^*_{\text{bidual-conservative}}\|_\infty\}$, we have that $x^*_{\text{bidual}} = x^*_{\text{bidual-conservative}}$. Therefore, we see
that taking a conservative value of $M$ is equivalent to solving the modified bidual in (14).

3.1 Results for entry-wise sparsity minimization

Notice that by substituting $\alpha_1 = \cdots = \alpha_K = 0$ and $\beta_1 = \cdots = \beta_K = 1$, the optimization problem
in (5) reduces to the entry-wise sparsity minimization problem in (1). Hence, the Lagrangian bidual
to the $M$-regularized entry-wise sparsity problem $(P_0)$ is:

$$x^*_{\text{entry-wise-bidual}} = \arg\min_x \frac{1}{M} \|x\|_1 \quad \text{s.t.} \quad \text{(a) } Ax = b \ \text{and (b) } \|x\|_\infty \le M. \tag{15}$$

More importantly, we can also conclude from (14) that solving the Lagrangian bidual to the
entry-wise sparsity problem with a conservative estimate of $M$ is equivalent to solving the problem:

$$x^*_{\text{entry-wise-bidual-conservative}} = \arg\min_x \|x\|_1 \quad \text{s.t.} \quad Ax = b, \tag{16}$$

which is precisely the well-known $\ell_1$-norm relaxation for $(P_0)$. Our framework therefore provides
a new interpretation for this relaxation:

Remark 2. The $\ell_1$-norm minimization problem in (16) is the Lagrangian bidual of the $\ell_0$-norm
minimization problem in (1), and solving (16) is equivalent to maximizing the dual of (1).

We further note that we can now use the solution of (15) to derive a non-trivial lower bound for the
primal objective function which is precisely the sparsity of the desired solution. More specifically,
we can use Theorem 1 to conclude the following result:

Corollary 1. Let $x_0^*$ be the solution to (1). We have that $\forall M \ge \|x_0^*\|_\infty$, the sparsity of $x_0^*$, i.e.,
$\|x_0^*\|_0$, is bounded below by $\frac{1}{M}\|x^*_{\text{entry-wise-bidual}}\|_1$.

Due to the non-zero duality gap in the primal entry-wise sparsity minimization problem, the above 
lower bound provided by Corollary 1 is not tight in general. 
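
As a sanity check that does not go through the duality argument, the bound of Corollary 1 can also be seen elementwise. The chain below is a sketch that assumes only $M \ge \|x_0^*\|_\infty$, so that $x_0^*$ is feasible for (15):

```latex
\frac{1}{M}\,\bigl\|x^*_{\text{entry-wise-bidual}}\bigr\|_1
\;\le\; \frac{1}{M}\,\|x_0^*\|_1
\;=\; \sum_{i:\,x_{0,i}^* \neq 0} \frac{|x_{0,i}^*|}{M}
\;\le\; \|x_0^*\|_0 .
```

The first inequality holds because $x_0^*$ is feasible for (15), and the last because every summand is at most 1. The bound is tight only when all non-zero entries of $x_0^*$ have magnitude $M$ and $x_0^*$ itself minimizes (15).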

3.2 Results for group sparsity minimization

Notice that by substituting $\alpha_1 = \cdots = \alpha_K = 1$ and $\beta_1 = \cdots = \beta_K = 0$, the optimization problem
in (5) reduces to the group sparsity minimization problem in (3). Hence, the Lagrangian bidual of
the group sparsity problem is:

$$x^*_{\text{group-bidual}} = \arg\min_x \frac{1}{M} \sum_{k=1}^K \|x_k\|_\infty \quad \text{s.t.} \quad \text{(a) } Ax = b \ \text{and (b) } \|x\|_\infty \le M. \tag{17}$$

As in the case of entry-wise sparsity above, solving the bidual to the group sparsity problem with a
conservative estimate of $M$ is equivalent to solving:

$$x^*_{\text{group-bidual-conservative}} = \arg\min_x \sum_{k=1}^K \|x_k\|_\infty \quad \text{s.t.} \quad Ax = b, \tag{18}$$

which is the convex $\ell_{1,\infty}$-norm relaxation of the $\ell_{0,p}$-min problem (3). In other words, the biduality
framework selects the $\ell_{1,\infty}$-norm out of the entire family of $\ell_{1,p}$-norms as the convex surrogate of
the $\ell_{0,p}$-norm.

Finally, we use Theorem 1 to show that the solution obtained by minimizing the $\ell_{1,\infty}$-norm provides
a lower bound for the group sparsity.

Corollary 2. Let $x_{0,p}^*$ be the solution to (3). For any $M \ge \|x_{0,p}^*\|_\infty$, the group sparsity of $x_{0,p}^*$, i.e.,
$\ell_{0,p}(x_{0,p}^*)$, is bounded below by $\frac{1}{M}\ell_{1,\infty}(x^*_{\text{group-bidual}})$.

The $\ell_{1,\infty}$-norm seems to be an interesting choice for computing the lower bound of the group
sparsity, as compared to other $\ell_{1,p}$-norms for finite $p < \infty$. For example, consider the case when
$p = 1$, where the $\ell_{1,p}$-norm is equivalent to the $\ell_1$-norm. Assume that $A$ consists of a single block
with several columns so that the maximum number of non-zero blocks is 1. Denote the solution to
the $\ell_1$-minimization problem as $x_1^*$. It is possible to construct examples (also see Figure 1) where
$\frac{1}{M}\ell_{1,1}(x_1^*) = \frac{1}{M}\ell_1(x_1^*) > 1$. Hence, it is unclear in general if the solutions obtained by minimizing
$\ell_{1,p}$-norms for finite-valued $p < \infty$ can help provide lower bounds for the group sparsity.
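
One such example can be checked by hand. The sketch below uses a deliberately degenerate, hypothetical instance (a single block and a square $A$, so the feasible point is unique) purely to show the arithmetic; it is not one of the instances from the experiments.

```python
import numpy as np

# Single block containing both columns, so the group sparsity is at most 1.
A = np.eye(2)
b = np.array([1.0, 1.0])
x = np.linalg.solve(A, b)             # the only feasible point, hence also the l1 minimizer
M = np.abs(x).max()                   # M = ||x||_inf

group_sparsity = int(np.any(x != 0))  # number of non-zero blocks
bound_l11 = np.abs(x).sum() / M       # (1/M) * l_{1,1}(x): exceeds the group sparsity
bound_l1inf = np.abs(x).max() / M     # (1/M) * l_{1,inf}(x): a valid lower bound
```

Here the $\ell_{1,1}$ quantity overshoots the true group sparsity, while the $\ell_{1,\infty}$ quantity does not.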

4 Experiments 

We now present experiments to evaluate the bidual framework for minimizing entry-wise sparsity
and mixed sparsity. We present experiments on synthetic data to show that our framework can be
used to compute non-trivial lower bounds for the entry-wise sparsity minimization problem. We
then consider the face recognition problem where we compare the performance of the bidual-based
$\ell_{1,\infty}$-norm relaxation with that of the $\ell_{1,2}$-norm relaxation for mixed sparsity minimization.

We use boxplots to provide a concise representation of our results' statistics. The top and bottom
edges of a boxplot for a set of values indicate the maximum and minimum of the values. The bottom
and top extents of the box indicate the 25th and 75th percentiles. The red mark in the box indicates
the median and the red crosses outside the boxes indicate potential outliers.

Entry-wise sparsity. We now explore the practical implications of Corollary 1 through synthetic
experiments. We randomly generate entries of $A \in \mathbb{R}^{128 \times 256}$ and $x_0 \in \mathbb{R}^{256}$ from a Gaussian
distribution with unit variance. The sparsity of $x_0$ is varied from 1 to 64 in steps of 3. We solve (15)
with $b = Ax_0$ using $M = M_0$, $2M_0$ and $5M_0$, where $M_0 = \|x_0\|_\infty$. We use Corollary 1 to compute
lower bounds on the true sparsity, i.e., $\|x_0\|_0$. We repeat this experiment 1000 times for each sparsity
level and Figure 1 shows the boxplots for the bounds computed from these experiments.
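
A single trial of this protocol might look as follows. This is a sketch under stated assumptions: the solver (SciPy's `linprog`), the random seed, the function name `sparsity_lower_bound` and the single sparsity level `s = 10` are illustrative choices, and the original experiments may have used different software.

```python
import numpy as np
from scipy.optimize import linprog

def sparsity_lower_bound(A, b, M):
    """Bound of Corollary 1: (1/M) * min ||x||_1 s.t. Ax = b, ||x||_inf <= M."""
    n = A.shape[1]
    # x = x_plus - x_minus with 0 <= x_plus, x_minus <= M enforces the box constraint.
    res = linprog(np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=b,
                  bounds=(0, M), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun / M

rng = np.random.default_rng(0)        # arbitrary seed
m, n, s = 128, 256, 10                # sizes from the text; s is one sparsity level
A = rng.standard_normal((m, n))
x0 = np.zeros(n)
x0[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
b = A @ x0
M0 = np.abs(x0).max()
lower_bounds = [sparsity_lower_bound(A, b, c * M0) for c in (1, 2, 5)]
```

Repeating this over many random draws and sparsity levels gives the statistics summarized by the boxplots. Each bound is at most `s`, and the bounds shrink as the multiple of `M0` grows.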

We first analyze the lower bounds computed when $M = M_0$, in Figure 1(a). As explained in Section
3.1, the bounds are not expected to be tight due to the duality gap. Notice that for extremely sparse
solutions, the maximum of the computed bounds is close to the true sparsity but this diverges as the
sparsity of $x_0$ reduces. The median value of the bounds is much looser and we see that the median
also diverges as the sparsity of $x_0$ reduces. Furthermore, the computed lower bounds seem to grow
linearly as a function of the true sparsity. Similar trends are observed for $M = 2M_0$ and $5M_0$
in Figures 1(b) and 1(c), respectively. As expected from the discussion in Section 3.1, the bounds
become very loose as $M$ increases.

Figure 1: Results for computing the lower bounds on the true (black lines) entry-wise sparsity
$\|x_0\|_0$ obtained over 1000 trials. The bounds are computed by solving (15) and using Corollary 1
with $M = M_0$, $2M_0$ and $5M_0$, where $M_0 = \|x_0\|_\infty$. Notice that as expected from the discussion in
Section 3.1, the bounds are not tight due to the duality gap and become looser as $M$ increases.



In theory, we would like to have per-instance certificates-of-optimality of the computed solution,
where the lower bound is equal to the true sparsity $\|x_0\|_0$. Nonetheless, we note that this ability
to compute a per-instance non-trivial lower bound on the sparsity of the desired solution is an
important step forward with respect to the previous approaches that require pre-computing optimality
conditions for equivalence of solutions to the $\ell_0$-norm and $\ell_1$-norm minimization problems.

We have performed a similar experiment for the group sparsity case, and observed that the bidual
framework is able to provide non-trivial lower bounds for the group sparsity also.

Mixed sparsity. We now evaluate the results of mixed sparsity minimization for the sparsity-based
face recognition problem, where the columns of $A$ represent training images from the $K$ face classes:
$A_1, \cdots, A_K$ and $b \in \mathbb{R}^m$ represents a query image. We assume that a subset of pixel values in the
query image may be corrupted or disguised. Hence, the error in the image space is modeled by a
sparse error term $e$: $b = b_0 + e$, where $b_0$ is the uncorrupted image. A linear representation of the
query image forms the following linear system of equations:

$$b = Ax + e = [A_1 \ \cdots \ A_K \ I]\,[x_1^T \ \cdots \ x_K^T \ e^T]^T, \tag{19}$$

where $I$ is the $m \times m$ identity matrix. The goal of sparsity-based classification (SBC) is to minimize
the group sparsity in $x$ and the sparsity of $e$ such that the dominant non-zero coefficients in $x$ reveal
the membership of the ground-truth observation $b_0 = b - e$ [13, 21]. In our experiments, we solve
for $x$ and $e$ by solving the following optimization problem:

$$\{x^*_{1,p}, e^*_1\} = \arg\min_{(x,e)} \sum_{k=1}^K \|x_k\|_p + \gamma \|e\|_1 \quad \text{s.t.} \quad Ax + e = b. \tag{20}$$

Notice that for $p = \infty$, this reduces to solving a special case of the problem in (14), i.e., the bidual
relaxation of the mixed sparsity problem with a conservative estimate of $M$. In our experiments, we
set $\gamma = 0.01$ and compare the solutions to (20) obtained using $p = 2$ and $p = \infty$.
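
For $p = \infty$, problem (20) can be written as a linear program by bounding each block with an epigraph variable $t_k \ge |x_i|$. The sketch below does this with SciPy's `linprog`. The function name `mixed_l1inf`, the epigraph reformulation and the toy data (including the value of `gamma`) are illustrative assumptions, not the solver used in the experiments.

```python
import numpy as np
from scipy.optimize import linprog

def mixed_l1inf(A, b, block_sizes, gamma):
    """min sum_k ||x_k||_inf + gamma * ||e||_1  s.t.  A x + e = b, posed as an LP.

    Variables are stacked as [x (n), t (K), e_plus (m), e_minus (m)], where t_k
    bounds |x_i| for every entry i of block k.
    """
    m, n = A.shape
    K = len(block_sizes)
    Pi = np.zeros((n, K))
    Pi[np.arange(n), np.repeat(np.arange(K), block_sizes)] = 1.0
    I_n, I_m = np.eye(n), np.eye(m)
    c = np.concatenate([np.zeros(n), np.ones(K), gamma * np.ones(2 * m)])
    A_eq = np.hstack([A, np.zeros((m, K)), I_m, -I_m])   # A x + e_plus - e_minus = b
    A_ub = np.vstack([
        np.hstack([I_n, -Pi, np.zeros((n, 2 * m))]),     #  x_i <= t_k
        np.hstack([-I_n, -Pi, np.zeros((n, 2 * m))]),    # -x_i <= t_k
    ])
    bounds = [(None, None)] * n + [(0, None)] * (K + 2 * m)
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n), A_eq=A_eq, b_eq=b,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    x = res.x[:n]
    e = res.x[n + K:n + K + m] - res.x[n + K + m:]
    return x, e, res.fun

# Hypothetical toy instance: two classes with one training column each.
A = np.eye(2)
b = np.array([1.0, 0.0])
x, e, value = mixed_l1inf(A, b, block_sizes=[1, 1], gamma=10.0)
```

With a large `gamma` the error term is expensive, so the query is explained by the first class alone. With a small `gamma` on this toy problem the solver would instead push everything into `e`, which shows how strongly the tradeoff parameter interacts with the problem dimensions.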

We evaluate the algorithms on a subset of the AR dataset [ ] which has manually aligned frontal
face images of size $83 \times 60$ for 50 male and 50 female subjects, i.e., $K = 100$ and $m = 4980$.
Each individual contributes 7 un-occluded training images, 7 un-occluded testing images and 12
occluded testing images. Hence, we have 700 training images and 1900 testing images. To compute
the number of non-zero blocks in the coefficient $x$ estimated for a testing image, we find the number
of blocks whose energy $\ell_1(x_k)$ is greater than a specified threshold.

The results of our experiments are presented in Figure 2. The solution obtained with $p = 2$ gives
better group sparsity of $x$. However, a sparser error $e$ is estimated with $p = \infty$. The number of
non-zero entities in a solution to (20), i.e., the number of non-zero blocks plus the number of
non-zero error entries, is lower for the solution obtained using $p = \infty$ rather than that obtained using
$p = 2$. However, the primal mixed-sparsity objective value $\ell_{0,p}(x) + \gamma\|e\|_0$ (see (4)) is lower for
the solution obtained using $p = 2$.

Figure 2: Comparison of mixed sparsity of the solutions to (20) for $p = 2$ and $p = \infty$. We present
boxplots for group sparsity of $x$ and entry-wise sparsity of $e$. The differences are calculated as (#
non-zero blocks/elements for $p = 2$) - (# non-zero blocks/elements for $p = \infty$). We see that for
$p = 2$ we get better group sparsity of $x$, but we get a more sparse error $e$ when we use $p = \infty$.





                          p = 2                                      p = ∞
               # (correct results)  % (correct results)   # (correct results)  % (correct results)
un-occluded           655                 93.57%                 663                 94.71%
occluded              643                 53.58%                 691                 57.58%
total                1298                 68.32%                1324                 69.68%

Table 1: Classification results on the AR dataset using the solutions obtained by minimizing mixed 
sparsity. The test set consists of 700 un-occluded images and 1200 occluded images. 



We now compare the classification results obtained with the solutions $x$ computed in our
experiments. For classification, we consider the non-zero blocks in $x$ and then assign the query image to
the block, i.e., subject class, for which it gives the least $\ell_2$ residual $\|b - A_k x_k\|_2$. The results are
presented in Table 1. Notice that the classification results obtained with $p = \infty$ (the bidual
relaxation) are better than those obtained using $p = 2$. Since the classification of un-occluded images is
already very good using $p = 2$, classification with $p = \infty$ gives only a minor improvement in this
case. However, a more tangible improvement is noticed in the classification of the occluded images.
Therefore the classification with $p = \infty$ is in general better than that obtained with $p = 2$, which is
considered the state-of-the-art for sparsity-based classification [13].
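
The classification rule described above can be sketched as follows. The function name `classify`, the `threshold` default and the toy data are assumptions made for illustration. The rule itself (smallest residual $\|b - A_k x_k\|_2$ over blocks whose $\ell_1$ energy exceeds a threshold) follows the text.

```python
import numpy as np

def classify(A, b, x, block_sizes, threshold=1e-6):
    """Return the index k of the non-zero block with the smallest ||b - A_k x_k||_2."""
    edges = np.concatenate([[0], np.cumsum(block_sizes)])
    best_k, best_residual = None, np.inf
    for k in range(len(block_sizes)):
        cols = slice(edges[k], edges[k + 1])
        if np.abs(x[cols]).sum() <= threshold:   # block treated as zero, skip it
            continue
        residual = np.linalg.norm(b - A[:, cols] @ x[cols])
        if residual < best_residual:
            best_k, best_residual = k, residual
    return best_k

# Hypothetical toy data: class 0 owns two columns, class 1 owns one.
A = np.eye(3)
x = np.array([1.0, 1.0, 0.1])
b = A @ x
label = classify(A, b, x, block_sizes=[2, 1])
```

In this toy case the query is explained almost entirely by the first block, so it is assigned to class 0. If every block falls below the threshold the function returns `None`.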



5 Discussion 

We have presented a novel analysis of several sparsity minimization problems which allows us to 
interpret several convex relaxations of the original NP-hard primal problems as being equivalent to 
maximizing their Lagrangian duals. The pivotal point of this analysis is the formulation of mixed- 
integer programs which are equivalent to the original primal problems. While we have derived the 
biduals for only a few sparsity minimization problems, the same techniques can also be used to 
easily derive convex relaxations for other sparsity minimization problems [ ]. 

An interesting result of our biduality framework is the ability to compute a per-instance certificate of
optimality by providing a lower bound for the primal objective function. This is in contrast to most
previous research, which aims to characterize either the subset of solutions or the set of conditions
for perfect sparsity recovery using the convex relaxations [5, 6, 8-10, 14, 15, 20]. In most cases, the
conditions are either weak or hard to verify. More importantly, these conditions need to be
pre-computed as opposed to verifying the correctness of a solution at run-time. In light of this, we hope
that our proposed framework will prove an important step towards per-instance verification of the
solutions. Specifically, it is of interest in the future to explore tighter relaxations for the verification
of the solutions.